Abstract

In this work we consider the communication of information in the presence of an online adversarial
jammer. In the setting under study, a sender wishes to communicate a message to a receiver by
transmitting a codeword $\mathbf{x} = (x_1, \ldots, x_n)$ symbol-by-symbol over a communication channel. The adversarial
jammer can view the transmitted symbols $x_i$ one at a time, and can change up to a $p$-fraction of them.
However, the decisions of the jammer must be made in an online or causal manner. Namely, for each
symbol $x_i$ the jammer's decision on whether to corrupt it or not (and on how to change it) must depend
only on $x_j$ for $j \le i$. This is in contrast to the "classical" adversarial jammer, which may base its
decisions on its complete knowledge of $\mathbf{x}$. More generally, for a delay parameter $d \in (0, 1)$, we study the
scenario in which the jammer's decision on the corruption of $x_i$ must depend solely on $x_j$ for $j \le i - dn$.

In this work, we initiate the study of codes for online adversaries, and present a tight characterization 
of the amount of information one can transmit in both the 0-delay and, more generally, the d-delay 
online setting. We show that for 0-delay adversaries, the achievable rate asymptotically equals that 
of the classical adversarial model. For positive values of d we show that the achievable rate can be 
significantly greater than that of the classical model. 

We prove tight results for both additive and overwrite jammers when the transmitted symbols are
assumed to be over a sufficiently large field $\mathbb{F}$. In the additive case the jammer may corrupt information
$x_i \in \mathbb{F}$ by adding onto it a corresponding error $e_i \in \mathbb{F}$. In this case the receiver gets the symbol
$y_i = x_i + e_i$. In the overwrite case, the jammer may corrupt information $x_i \in \mathbb{F}$ by replacing it with
a corresponding corrupted symbol $y_i \in \mathbb{F}$. For positive delay $d$, the symbol $x_i$ may not be known to the
adversarial jammer at the time it is being corrupted; hence these two error models, and the corresponding
achievable rates, are shown to differ substantially.

Finally, we extend our results to a jam-or-listen online model, where the online adversary can either 
jam a symbol or eavesdrop on it. This corresponds to several scenarios that arise in practice. We again 
provide a tight characterization of the achievable rate for several variants of this model. 

The rate regions we prove for each model are information-theoretic in nature and hold for computationally
unbounded adversaries. The rate regions are characterized by "simple" piecewise linear functions
of $p$ and $d$. The codes we construct to attain the optimal rate for each scenario are computationally
efficient.
1 Introduction



Consider the following adversarial communication scenario. A sender Alice wishes to transmit a message
$u$ to a receiver Bob. To do so, Alice encodes $u$ into a codeword $\mathbf{x}$ and transmits it over a channel. In this
work the codeword $\mathbf{x} = (x_1, \ldots, x_n)$ is considered to be a vector of length $n$ over an alphabet $\mathbb{F}$ of size $q$.
However, Calvin, a malicious adversary, can observe $\mathbf{x}$ and corrupt up to a $p$-fraction of the $n$ transmitted
symbols (i.e., $pn$ symbols).

In the classical adversarial channel model, e.g., [3 [3], it is usually assumed that Calvin has full knowledge
of the entire codeword $\mathbf{x}$, and based on this knowledge (together with the knowledge of the code shared
by Alice and Bob) Calvin can maliciously plan what error to impose on $\mathbf{x}$. We refer to such an adversary
as an omniscient adversary. For large values of $q$ (which is the focus of this work) communication in the
presence of an omniscient adversary is well understood. It is known that Alice can transmit no more than
$(1 - 2p)n$ error-free symbols to Bob when using codewords of block length $n$. Further, efficient schemes
such as Reed-Solomon codes [1] are known to achieve this optimal rate.

Online adversaries In this work we initiate the analysis of coding schemes that allow communication
against certain adversaries that are weaker than the omniscient adversary. We consider adversaries that
behave in an online manner. Namely, for each symbol $x_i$, we assume that Calvin decides whether to change
it or not (and if so, how to change it) based on the symbols $x_j$ for $j \le i$ alone, i.e., the symbols that he
has already observed. In this case we refer to Calvin as an online adversary.

Online adversaries arise naturally in practical settings, where adversaries typically have no a priori
knowledge of Alice's message $u$. In such cases they must simultaneously learn $u$ based on Alice's
transmissions, and jam the corresponding codeword $\mathbf{x}$ accordingly. This causality assumption is reasonable for
many communication channels, both wired and wireless, where Calvin is not co-located with Alice. For
example, consider the scenario in which the transmission of $\mathbf{x} = (x_1, \ldots, x_n)$ is done during $n$ channel uses
over time, where at time $i$ the symbol (or packet) $x_i$ is transmitted over the channel. Calvin can only
corrupt a packet when it is transmitted (and thus his error is based on his view so far). To decode the
transmitted message, Bob waits until all the packets have arrived. As in the omniscient model, Calvin is
restricted in the number of packets $pn$ he can corrupt. This might be because of limited processing power,
limited transmit energy, or a need to keep his location secret.

In addition to the online adversaries described above, we also consider the more general scenario in
which Calvin's jamming decisions are delayed. That is, for a delay parameter $d \in (0, 1)$, Calvin's decision
on the corruption of $x_i$ must depend solely on $x_j$ for $j \le i - dn$. We refer to such adversaries as d-delay
online adversaries. Such d-delay online adversaries correspond, for example, to the scenario in which the
error transmission of the adversary is delayed due to certain computational tasks that the adversary needs
to perform. We show that the 0-delay model (i.e., $d = 0$) and the d-delay model for $d > 0$ display different
behaviour, hence we treat them separately.

Error model We consider two types of attacks by Calvin. An additive attack is one in which Calvin
can add $pn$ error symbols $e_i$ to Alice's transmitted symbols $x_i$. Thus $y_i$, the $i$'th symbol Bob receives,
equals $x_i + e_i$. Here addition is defined over the finite field $\mathbb{F}_q$ with $q$ elements. An overwrite attack is
one in which Calvin overwrites $pn$ of Alice's transmitted symbols $x_i$ by the symbols $y_i$ received by Bob.¹
These two attacks are significantly different if we assume that at the time Calvin is corrupting $x_i$ he has
no knowledge of its value; this is exactly the positive-delay scenario.

The two attacks we study are intended to model different physical models of Calvin's jamming. For
instance, in wired packet-based channels Calvin can directly replace some transmitted packets $x_i$ with
some fake packets $y_i$, and therefore behave like an overwriting adversary. On the other hand, in wireless
networks, Bob's received signal is usually a function of both $x_i$ and the additive error $e_i$.

¹ Note that in the 0-delay case these two attacks are equivalent. This is because in both cases Calvin can change an $x_i$
into an arbitrary $y_i$; an additive Calvin can choose $e_i = y_i - x_i$, whereas an overwriting Calvin directly uses $y_i$.

Lastly we consider the jam-or-listen online adversary. In this scenario, in addition to being an online
adversary, if Calvin jams a symbol $x_i$ then he has no idea what value it takes. This model is again motivated
by wireless transmissions, where a node can typically either transmit or receive, but not both. For this
model, we consider all four combinations of 0-delay/d-delay, and additive/overwrite errors.

A rate $R$ is said to be achievable against an adversary Calvin if it is possible for Alice to transmit a
message $u$ of at least $Rn$ symbols over $n$ channel uses to Bob (with probability of decoding error going
to zero as $n \to \infty$). The capacity, when communicating in the presence of a certain adversarial model, is
defined to be the supremum of all achievable rates. Thus, the capacity characterizes the rate achievable in
the adversarial model under study. We denote the capacity of the classical omniscient adversarial channel
which can change $pn$ characters by $C^{\mathrm{omni}}(p)$. We denote the capacity of the d-delay online adversarial
channels which can change $pn$ characters by $C^{\mathrm{add}}_d(p)$ for the additive error model, and $C^{\mathrm{ow}}_d(p)$ for the
overwrite error model. For the jam-or-listen adversary, we denote the corresponding capacities by
$C^{\mathrm{jam,add}}_d(p)$ or $C^{\mathrm{jam,ow}}_d(p)$, depending on whether Calvin uses additive or overwrite errors. A more detailed
discussion of our definitions and notation is given in Section 2.

Our results In this work, we initiate the study of codes for online adversaries, and present a tight 
characterization of the amount of information one can transmit in both the 0-delay and, more generally, 
the d-delay online setting. To the best of our knowledge, communication in the presence of an online 
adversary (with or without delay) has not been explicitly addressed in the literature. Nevertheless, we 
note that the model of online channels, being a natural one, has been "on the table" for several decades 
and the analysis of the online channel model appears as an open question in the book of Csiszár and Körner
[1] (in the section addressing Arbitrarily Varying Channels [2]). Several variants of causal adversaries have
been addressed in the past, for instance [21 (H [HI [121 E]; however, the models considered therein differ
significantly from ours.

At a high level, we show that for 0-delay adversaries the achievable rate equals that of the classical 
"omniscient" adversarial model. This may at first come as a surprise, as the online adversary is weaker 
than the omniscient one, and hence one may suspect that it allows a higher rate of communication. We 
then show, for positive values of the delay parameter $d$, that the achievable rate can be significantly greater
than that achievable against omniscient adversaries.

We stress that our results are information-theoretic in nature and thus hold even if the adversary is 
computationally unbounded. The codes we construct to achieve the optimal rates are computationally 
efficient to design, and for Alice and Bob to implement (i.e., efficiently encodable and decodable). All 
our results assume that the field size $q$ is significantly larger than $n$. In some cases it suffices to take
$q = \mathrm{poly}(n)$, but in others we need $q = \exp(\mathrm{poly}(n))$. Both settings lend themselves naturally to real-world
scenarios, as in both cases a field element $x_i$ can be represented by a polynomial (in $n$) number of bits.

The exact statements of our results are in Theorems 1, 2, 3 and 4 below. The technical parameters
(including rate, field size, error probability, and time complexity) of our results are summarized in Table 1
of the Appendix. We start by showing that in the 0-delay case, the capacity of the online channel equals
that of the stronger omniscient channel model.

Theorem 1 (0-delay model) For any $p \in [0, 1]$, the capacity of the 0-delay online adversarial channel
under both the overwrite and additive error models equals the capacity under the omniscient model.
In particular,

$$C^{\mathrm{ow}}_0(p) = C^{\mathrm{add}}_0(p) = C^{\mathrm{omni}}(p) = (1 - 2p)^+ = \max\{1 - 2p,\, 0\}. \qquad (1)$$

Moreover, the capacity can be attained by an efficient encoding and decoding scheme.

Next we characterize the capacity of the d-delay online channel under the additive error model. 
Theorem 2 (d-delay with additive error model) For any $p \in [0, 1]$ the capacity $C^{\mathrm{add}}_d(p)$ of the
d-delay online channel for $d > 0$ under the additive error model is $1 - p$. Moreover, the capacity can be
attained by an efficient encoding and decoding scheme.

We then turn to study the d-delay online channel under the overwrite error model. The capacity we
present is at least as large as that achievable against an additive or overwrite 0-delay adversary who changes
$pn$ symbols. However, it is sometimes significantly lower than that achievable against an additive d-delay
adversary.

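Theorem 3 (d-delay with overwrite error model) For any $p \in [0, 1]$ the capacity of the d-delay
online channel under the overwrite error model is

$$C^{\mathrm{ow}}_d(p) = \begin{cases} 1 - p, & p \in [0, 0.5),\ p < d, \\ 1 - 2p + d, & p \in [0, 0.5),\ p \ge d, \\ 0, & p \in [0.5, 1]. \end{cases} \qquad (2)$$

Moreover, the capacity can be attained by an efficient encoding and decoding scheme.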
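
As an illustration, the piecewise linear rate expressions of Theorems 1, 2 and 3 can be evaluated numerically. The following is a minimal sketch; the function names are ours and are not part of the paper, and we assume that $d = 0$ falls back to the omniscient expression of Theorem 1 and that the boundary case $p = d$ takes the common value $1 - p$.

```python
def capacity_omniscient(p: float) -> float:
    """(1 - 2p)^+, the omniscient (and 0-delay online) capacity of Theorem 1."""
    return max(1.0 - 2.0 * p, 0.0)


def capacity_additive(p: float, d: float) -> float:
    """d-delay additive capacity: 1 - p for any positive delay (Theorem 2)."""
    if d <= 0.0:
        # Assumption: zero delay reduces to the setting of Theorem 1.
        return capacity_omniscient(p)
    return 1.0 - p


def capacity_overwrite(p: float, d: float) -> float:
    """d-delay overwrite capacity, following the three cases of Theorem 3."""
    if d <= 0.0:
        # Assumption: zero delay reduces to the setting of Theorem 1.
        return capacity_omniscient(p)
    if p >= 0.5:
        return 0.0
    if p < d:
        return 1.0 - p
    return 1.0 - 2.0 * p + d
```

For instance, with $p = 0.3$ and $d = 0.1$ the three functions return approximately 0.4, 0.7 and 0.5 respectively, which shows the overwrite capacity lying between the omniscient and additive values.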

Lastly, we show that the optimal rates achievable against a jam-or-listen online adversary equal the 
corresponding optimal rates achievable against an online adversary, for each of the four combinations of 0- 
or d-delay, and additive or overwrite attacks. 

Theorem 4 (jam-or-listen model) For any $p$ and $d$ in $[0, 1]$ the capacity of the d-delay online channel
under the jam-or-listen error model is equal to that of the d-delay online channel:

$$C^{\mathrm{jam,add}}_d(p) = C^{\mathrm{add}}_d(p), \qquad C^{\mathrm{jam,ow}}_d(p) = C^{\mathrm{ow}}_d(p). \qquad (3)$$

Moreover, the capacity can be attained by the same efficient encoding and decoding schemes as in
Theorems 1, 2 and 3.

Outline of proof techniques The proofs of Theorems 1, 2, 3 and 4 require obtaining several nontrivial
upper and lower bounds on the capacity of the corresponding channel models. The lower bounds
are proved constructively by presenting efficient encoding and decoding schemes operating at the optimal
rates of communication. The upper bounds are typically proven by presenting strategies for Calvin that
result in a probability of decoding error that is strictly bounded away from zero regardless of Alice and
Bob's encoding/decoding schemes.

Theorem 1 states that communication in the presence of a 0-delay online adversary is no easier than
communicating in the presence of the (more powerful) omniscient adversary. There already exist efficient
encoding and decoding schemes that allow communication at the optimal rate of $1 - 2p$ in the presence of an
omniscient adversary [101 [T]. Thus our contribution in this scenario is in the design of a strategy for Calvin
that does not allow communication at a higher rate. The strategy we present is fairly straightforward, and
allows Calvin to enforce a probability of error of at least 1/4 whenever Alice and Bob communicate
at a rate higher than $1 - 2p$. Roughly speaking, Calvin uses a two-phase wait-and-attack strategy. In
the first phase (whose length depends on $p$), Calvin does not corrupt the transmitted symbols but merely
eavesdrops. He is thus able to reduce his ambiguity regarding the codeword $\mathbf{x}$ that Alice transmits. In the
second phase, using the knowledge of $\mathbf{x}$ he has gained so far, Calvin designs an error vector to be imposed
on the remaining part of the codeword that Alice is yet to transmit.

Theorem 2 states that for $d > 0$, the capacity of the d-delay online channel under the additive error
model is $1 - p$. Note that this expression is independent of $d$. In fact, even if Calvin's attack is delayed
by just a single symbol, the rate of communication achievable between Alice and Bob is strictly greater
than in the corresponding scenario in Theorem 1. The upper bound follows directly from the simple
observation that Calvin can always add $pn$ random symbols from $\mathbb{F}_q$ to the first $pn$ symbols of $\mathbf{x}$, and
therefore the corresponding symbols received carry no information. The lower bound involves a nontrivial
code construction. In a nutshell, we show a reduction between communicating over the d-delay
online channel under the additive error model and communicating over an erasure channel. In an erasure
channel, the receiver Bob is assumed to know which of the $pn$ elements of the transmitted codeword $\mathbf{x}$
were corrupted by Calvin. As one can efficiently communicate over an erasure channel with rate $1 - p$,
e.g., [2, we obtain the same rate for our online channel. The main question is now: "In our model, how
can Bob detect that a received symbol $y_i$ was corrupted by Calvin?" The idea is to use authentication
schemes which are information-theoretically secure, and lend themselves to the adversarial setting at hand.
Namely, each transmitted symbol will include some internal redundancy, a signature, which upon decoding
will be authenticated. As Calvin is a positive-delay adversary, it is assumed that he is unaware of both
the symbol being transmitted and its signature. It is enough that the signature scheme we construct be
resilient against such an adversary.

In Theorem 3 both the lower and upper bound on the capacity require novel constructions. For the
upper bound we refine the "wait-and-attack" strategy for Calvin outlined in the discussion above on
Theorem 1 to fit the d-delay scenario. For the lower bound, we change Alice and Bob's encoding/decoding
schemes, outlined in the discussion above on Theorem 2, to fit the d-delay overwrite model. Namely, as
before, Alice's encoding scheme consists of an erasure code along with a hash function used to authenticate
individual symbols. However, in general, an overwrite adversary is more powerful than an additive
adversary. This is because an overwriting adversary can substitute any symbol $x_i$ by a new symbol $y_i$.
Thus Calvin can choose to replace $x_i$ with a symbol $y_i$ that is a valid output of the hash function. Hence
the design of the hash function for Theorem 3 is more intricate than the corresponding construction in
Theorem 2.

Roughly speaking, in the scheme we propose for the d-delay overwrite scenario, the redundancy added
to each symbol $x_i$ contains information that allows pairwise authentication (via a pairwise independent
hash function). Namely, each symbol $x_i$ contains $n$ signatures $\sigma_{ij}$ (one for each symbol $x_j \in \mathbf{x}$). Using
these signatures, some pairs of symbols $x_i$ and $x_j$ can be mutually authenticated to check whether exactly
one of them has been corrupted. (For instance, symbols $x_i$ and $x_j$ such that $|i - j| < dn$ can be used for
mutual authentication, since when Calvin corrupts either one of them he does not yet know the value of
the other.) This allows Bob to build a consistency graph containing a vertex corresponding to each received
symbol, and an edge connecting mutually consistent symbols. Bob then analyzes certain combinatorial
properties of this consistency graph to extract a maximal set of mutually consistent symbols. He finally
inverts Alice's erasure code to retrieve her message. We view Bob's efficient decoding algorithm as the
main technical contribution of this work.

Lastly, Theorem 4 states that a jam-or-listen adversary is still as powerful as the previously described
online adversaries. This is interesting because a jam-or-listen adversary is in general weaker than an
online adversary, since he never finds out the values of the symbols he corrupts. This theorem is a corollary
of Theorems 1, 2 and 3, as follows. The code constructions corresponding to the lower bounds are the same
as in Theorems 1, 2 and 3. As for the upper bounds, we note that the attacks described for Calvin in
Theorems 1, 2 and 3 actually correspond to a jam-or-listen adversary, and hence are valid attacks for
this scenario as well.

Outline The rest of the paper is organized as follows. In Section 2 we present a detailed description
of our adversarial models together with some notation to be used throughout our work. In Section 3 we
present the proof of Theorem 2. In Section 4 we present the main technical contribution of this work,
the proof of Theorem 3. Theorem 1, although stated first in the Introduction, follows rather easily from
the proof of Theorem 3 and is thus presented in Section B of the Appendix. Theorem 4 follows directly
from Theorems 1, 2 and 3, and is thus presented in Section C of the Appendix. Some remarks and open
problems are finally given in the concluding section. The technical parameters of our results are summarized in Table 1
of the Appendix.






2 Definitions and Notation 



For clarity of presentation we repeat and formalize the definitions presented earlier. Let $q$ be a power of
some prime integer, and let $\mathbb{F}_q$ be the field of size $q$. Throughout this work we assume that the field size
$q$ is exponential in $\mathrm{poly}(n)$ (although some of our results only need $q$ to be polynomial in $n$) and
that our parameters $p$ and $d$ are constant. For any integer $i$ let $[i]$ denote the set $\{1, \ldots, i\}$. Let $R \ge 0$ be
Alice's rate. An $[n, nR]_q$-code is defined by Alice's encoder and Bob's corresponding decoder, as defined
below.

Alice: Alice's message $u$ is assumed to be an element of $[q^{nR}]$. In our schemes, Alice will also hold a
uniformly distributed secret $r$ which is assumed to be a number of elements (say $\ell$) of $[q]$. Alice's secret is
assumed to be unknown to both Bob and Calvin prior to transmission. Alice's encoder is a deterministic
function mapping every $(u, r)$ in $[q^{nR}] \times [q^{\ell}]$ to a vector $\mathbf{x} = (x_1, \ldots, x_n)$ in $\mathbb{F}_q^n$.

Calvin/Channel: We assume that Calvin is online, namely at the time that the character $x_i$ is
transmitted Calvin has the knowledge of $\{x_j\}_{j \in K_i}$. Here the knowledge set $K_i$ is a subset of $[i]$ that is
defined below according to the different jamming models we study. Using his jamming function, Calvin
either replaces Alice's transmitted symbol $x_i \in \mathbb{F}_q$ with a corresponding symbol $y_i \in \mathbb{F}_q$, or adds an error $e_i \in \mathbb{F}_q$ to
$x_i$ such that Bob receives $y_i = x_i + e_i$.

In this work, Calvin's knowledge sets must satisfy the following constraints. Causality/d-delay: Calvin's
knowledge set $K_i$ is a subset of $[i - dn]$. Jam-or-listen: If Calvin is a jam-or-listen adversary, $K_i$ is
inductively defined so that it does not contain any $j < i$ such that $y_j \ne x_j$. That is, Calvin has no knowledge
of any $x_i$ he corrupts.

Calvin's jamming function must satisfy the following constraints. For each $i$, Calvin's jamming function,
and in particular the corresponding error symbol $e_i \in \mathbb{F}_q$, depends solely on the set $\{x_j\}_{j \in K_i}$, Alice's
encoding scheme, and Bob's decoding scheme. Additive/Overwrite: If Calvin is an additive adversary,
$y_i = x_i + e_i$, with addition defined over $\mathbb{F}_q$. If Calvin is an overwrite adversary, $y_i = e_i$. Power: Bob's
received symbol $y_i$ differs from Alice's transmitted symbol $x_i$ for at most $pn$ values of $i$ in $[n]$.
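
To make the causality and power constraints concrete, the following is a minimal simulation sketch of the channel just described. The function `run_channel` and the callback interface for the adversary are our own illustrative choices, not constructs from the paper; for simplicity we assume $q$ is prime (so that arithmetic modulo $q$ is field arithmetic), that $pn$ and $dn$ are integers up to rounding, and we do not model the jam-or-listen restriction.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical adversary interface: given the time index i (1-based) and the
# symbols x_1..x_{i-dn} he is allowed to know, return an error symbol e_i,
# or None to leave x_i untouched.
Adversary = Callable[[int, List[int]], Optional[int]]


def run_channel(x: Sequence[int], adversary: Adversary, p: float, d: float,
                q: int, mode: str = "additive") -> List[int]:
    """Pass x through a d-delay online adversary with power budget pn."""
    n = len(x)
    budget = round(p * n)   # at most pn corrupted symbols
    delay = round(d * n)    # knowledge set is a subset of [i - dn]
    y, corrupted = [], 0
    for i in range(1, n + 1):
        known = list(x[: max(i - delay, 0)])  # causality / d-delay constraint
        e = adversary(i, known)
        xi = x[i - 1]
        if e is None:
            yi = xi
        elif mode == "additive":
            yi = (xi + e) % q                 # y_i = x_i + e_i over F_q
        else:
            yi = e % q                        # overwrite: y_i = e_i
        if yi != xi:
            corrupted += 1
            if corrupted > budget:
                raise ValueError("adversary exceeded the power constraint pn")
        y.append(yi)
    return y
```

With `d = 0` the callback sees $x_i$ itself, so an additive adversary can force any $y_i$ by returning $y_i - x_i$, matching the observation that the two error models coincide at zero delay. With positive delay the callback no longer sees $x_i$, and the two modes behave differently.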

Bob: Bob's decoder is a (potentially) probabilistic function solely of Alice's encoder and the received
vector $\mathbf{y}$. It maps every vector $\mathbf{y} = (y_1, \ldots, y_n)$ in $\mathbb{F}_q^n$ to an element $u'$ of $[q^{nR}]$.

Code parameters: Bob is said to make a decoding error if the message he decodes u' differs from that 
encoded by Alice, u. The probability of error for a given message u is defined as the probability, over Alice's 
secret r, Calvin's randomness, and Bob's randomness, that Bob decodes incorrectly. The probability of 
error of the coding scheme is defined as the maximum over all u of the probability of error for message u. 
Note that these definitions imply that a successful decoding scheme allows a worst case promise. Namely, 
it implies high success probability no matter which message u was chosen by Alice. 

The rate $R$ is said to be achievable if for every $\varepsilon > 0$, $\delta > 0$ and every sufficiently large $n$ there
exists a computationally efficient $[n, n(R - \delta)]_q$-code that allows communication with probability of error
at most $\varepsilon$. The supremum of the achievable rates is called the capacity and is denoted by $C$. We denote
the capacity of the d-delay online adversarial channels under the additive error model by $C^{\mathrm{add}}_d(p)$ and
under the overwrite error model by $C^{\mathrm{ow}}_d(p)$. For a jam-or-listen adversary we denote the corresponding
capacities by $C^{\mathrm{jam,add}}_d(p)$ and $C^{\mathrm{jam,ow}}_d(p)$.

We put no computational restrictions on Calvin. This is because our proofs are information-theoretic 
in nature, and are valid even for a computationally unbounded adversary. However, our constructions are
computationally efficient for Alice and Bob.

Remark 2.1 We can allow Calvin to be even stronger than outlined in the model above. In particular, 
Calvin's jamming function can also depend on Alice's message u, and our Theorems and corresponding 
proofs are unchanged. The crucial requirement is that each of Calvin's jamming functions be independent 
of Alice's secret r, conditioned on the symbols in the corresponding knowledge set. That is, the only 
information Calvin has of Alice's secret, he gleans by observing x. 

Packets: For several of our code constructions (specifically those in Theorems 2 and 3), it is conceptually
and notationally convenient to view each symbol from $\mathbb{F}_q$ as a "packet" of symbols from a smaller
finite field $\mathbb{F}_{q'}$ of size $q'$ instead. In particular, we assume $(q')^m = q$. Here $m$ is an integer code-design
parameter to be specified later. For a codeword $\mathbf{x} = (x_1, \ldots, x_n)$, Alice treats each symbol (or packet) $x_i$
in $\mathbb{F}_q$ as $m$ sub-symbols $x_{i,1}$ through $x_{i,m}$ from $\mathbb{F}_{q'}$. Similarly, she treats her secret $r$ as $m$ sub-symbols $r_1$
through $r_m$ from $\mathbb{F}_{q'}$.

3 Proof of Theorem 2

We consider block length $n$ large enough so that $d > 1/n$. Throughout, to simplify our presentation, we
assume that expressions such as $pn$ or $dn$ are integers. We first prove that $1 - p$ is an upper bound on $C^{\mathrm{add}}_d(p)$
by showing a "random-add" strategy for Calvin. Namely, consider an adversary who chooses elements of
$\mathbb{F}_q$ uniformly at random and adds them to the first $pn$ symbols in Alice's transmissions. Thus the first $pn$
symbols Bob receives are uniformly distributed random elements of $\mathbb{F}_q$, and carry no information at all.
It is not hard to verify that such an adversarial strategy allows communication between Alice and Bob at
rate at most $1 - p$. This concludes our discussion for the upper bound.
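
The key fact behind the random-add strategy is that $x_i + e_i$ is uniform over the field whenever $e_i$ is uniform, whatever $x_i$ is. The following small check illustrates this; it is a sketch under the assumption that the field is the integers modulo a prime $q$, and the helper name is ours.

```python
from collections import Counter


def received_distribution(x_i: int, q: int) -> Counter:
    """Histogram of y_i = x_i + e_i (mod q) as e_i ranges over all of F_q."""
    return Counter((x_i + e) % q for e in range(q))


# Every value of F_q is hit exactly once, independently of x_i, so the
# received symbol reveals nothing about the transmitted one.
q = 7
uniform = Counter({v: 1 for v in range(q)})
all_uniform = all(received_distribution(x_i, q) == uniform for x_i in range(q))
```

Since the distribution of the first $pn$ received symbols does not depend on the codeword, Bob can learn about the message only from the remaining $(1 - p)n$ symbols.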

We now describe how Alice and Bob achieve a rate approaching $1 - p$ with computationally tractable
codes. Alice's encoding is in two phases. In the first phase, roughly speaking, she uses an erasure code to
encode the approximately $(1 - p)n$ symbols of her message $u$ into an erasure-codeword $\mathbf{v}$ with $n$ symbols.
The erasure code allows $u$ to be retrieved from any subset of at least $(1 - p)n$ symbols of the
erasure-codeword $\mathbf{v}$. In the second phase, Alice uses $n$ "short" random keys and corresponding hash functions
to transform each symbol $v_i$ of the erasure-codeword $\mathbf{v}$ into the corresponding transmitted symbol $x_i$.
This hash function is carefully constructed so that if Calvin (a positive-delay additive adversary) corrupts
a symbol $x_i$, with high probability Bob is able to detect this in a computationally efficient manner by
examining the corresponding received $y_i$. Bob's decoding scheme is also a two-phase process. In the first
phase he uses the hash scheme described above to discard the symbols he detects Calvin has corrupted
(there are at most $pn$ such symbols). In the second phase Bob uses the remaining $(1 - p)n$ symbols and
the decoder of Alice's erasure code to retrieve her message. We assume Alice's erasure code is efficiently
encodable and decodable (for instance Reed-Solomon codes [101 [T] can be used). In what follows we give
our code construction in detail.
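
For concreteness, the erasure-coding phase can be sketched with a toy Reed-Solomon code in which the message symbols are the coefficients of a polynomial and the codeword consists of its evaluations. This is a minimal illustration, not the paper's implementation: we assume a prime field of size `prime` larger than $n$, evaluation points $1, \ldots, n$, and plain Lagrange interpolation for decoding; all names are ours.

```python
from typing import List, Optional, Sequence


def rs_encode(message: Sequence[int], n: int, prime: int) -> List[int]:
    """Evaluate the polynomial with coefficients `message` at points 1..n."""
    return [sum(c * pow(a, j, prime) for j, c in enumerate(message)) % prime
            for a in range(1, n + 1)]


def rs_decode_erasures(received: Sequence[Optional[int]], k: int,
                       prime: int) -> List[int]:
    """Recover the k message coefficients from any k unerased symbols.

    Erased positions are marked with None.
    """
    points = [(a, y) for a, y in enumerate(received, start=1)
              if y is not None][:k]
    if len(points) < k:
        raise ValueError("too many erasures to decode")
    coeffs = [0] * k
    for j, (aj, yj) in enumerate(points):
        basis, denom = [1], 1   # numerator polynomial (low order first)
        for l, (al, _) in enumerate(points):
            if l == j:
                continue
            # Multiply the numerator by (X - a_l).
            new = [0] * (len(basis) + 1)
            for t, b in enumerate(basis):
                new[t] = (new[t] - al * b) % prime
                new[t + 1] = (new[t + 1] + b) % prime
            basis = new
            denom = denom * (aj - al) % prime
        scale = yj * pow(denom, -1, prime) % prime  # modular inverse
        for t, b in enumerate(basis):
            coeffs[t] = (coeffs[t] + scale * b) % prime
    return coeffs
```

With $k = (1 - p)n$ message symbols, any $pn$ erased positions can be tolerated, which is exactly the property the reduction to the erasure channel relies on.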

Let $q$ be sufficiently large (to be specified explicitly later in the proof). Let $m = n^2 + 2n$. As mentioned
in Section 2, Alice treats each symbol $x_i$ of a codeword $\mathbf{x}$ as a packet, by breaking each $x_i$ into $m$
sub-symbols $x_{i,1}$ through $x_{i,m}$ from $\mathbb{F}_{q'}$. She partitions $x_{i,1}$ through $x_{i,m}$ into three consecutive sequences
of sub-symbols of sizes $n^2$, $n$ and $n$ respectively. The sub-symbols $x_{i,1}$ through $x_{i,n^2}$ are denoted by the
set $w_i$, and correspond to the sub-symbols of $v_i$, the $i$th symbol of the erasure-codeword $\mathbf{v}$ generated
by Alice. The next $n$ sub-symbols are denoted by the set $r_i$, and consist of Alice's secret for packet $i$,
namely, $n$ sub-symbols chosen independently and uniformly at random from $\mathbb{F}_{q'}$. For each $i$, $r_i$ is chosen
independently. The final $n$ sub-symbols are denoted by the set $\sigma_i$, and consist of the hash (or signature)
of the information $w_i$ by the function $H_{r_i}$. Here, $H_{r_i}$ is taken from a family $\mathcal{H}$ of hash functions (known
to all parties in advance) to be defined shortly. All in all, each transmitted symbol $x_i$ of Alice consists of
the tuple $(w_i, r_i, H_{r_i}(w_i))$.

We now explicitly demonstrate the construction of each $w_i$ from Alice's message $u$. Alice chooses
$R = (1 - 2n/m)(1 - p)$. Thus the message $u$ she wishes to transmit to Bob has $mnR = (m - 2n)(1 - p)n =
(1 - p)n^3$ sub-symbols over $\mathbb{F}_{q'}$. Alice uses an erasure code (resilient to $pn^3$ erasures) to transform these
sub-symbols of $u$ into the vector $\mathbf{v}$ consisting of $n^3$ sub-symbols over $\mathbb{F}_{q'}$. She then denotes consecutive
blocks of $n^2$ sub-symbols of $\mathbf{v}$ by the corresponding $w_i$'s. More specifically, $w_i$ consists of the sub-symbols
in $\mathbf{v}$ in locations $n^2(i - 1)$ through $n^2 i - 1$.
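
The parameter bookkeeping above can be checked with exact rational arithmetic. The sketch below assumes the choice $m = n^2 + 2n$ made in the construction; the function name and return layout are ours.

```python
from fractions import Fraction


def packet_parameters(n: int, p: Fraction):
    """Return (m, R, message sub-symbols, codeword sub-symbols) for block length n."""
    m = n * n + 2 * n                               # sub-symbols per packet
    rate = (1 - Fraction(2 * n, m)) * (1 - p)       # R = (1 - 2n/m)(1 - p)
    message_subsymbols = m * n * rate               # mnR = (1 - p) n^3
    codeword_subsymbols = n ** 3                    # n blocks w_i of n^2 sub-symbols
    return m, rate, message_subsymbols, codeword_subsymbols
```

For example, `packet_parameters(10, Fraction(1, 4))` gives $m = 120$, $R = 5/8$, and $750 = (1 - p)n^3$ message sub-symbols encoded into $1000$ codeword sub-symbols. Since $R = (1 - p)\,n/(n + 2)$, the overhead of the $2n$ key and signature sub-symbols per packet vanishes as $n$ grows.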

Before completing the description of Alice's encoder by describing the hash family, we outline Bob's
decoder. Bob first authenticates each received symbol $y_i = (w'_i, r'_i, \sigma'_i)$ by checking that $H_{r'_i}(w'_i) = \sigma'_i$. He
then decodes using the decoding algorithm of the erasure code on the sub-symbols $w'_i$ of all symbols $y_i$
that pass Bob's authentication test.

We now define our hash family $\mathcal{H}$ and show that with high probability any corrupted symbol $y_i \ne x_i$
will not pass Bob's authentication check. More specifically, we study only corrupted symbols $y_i \ne x_i$ for
which $w'_i \ne w_i$. (If $w'_i = w_i$, the erasure decoder described above will not make an error.) Let $e_i$ be the error
imposed by Calvin in the transmission of the $i$'th packet $x_i$. Hence for an additive adversary Calvin, $e_i$ is
defined by $y_i = x_i + e_i$. Analogously to the corresponding sub-divisions of $x_i$ and $y_i$, we decompose $e_i$ into
the tuple $(\tilde{w}_i, \tilde{r}_i, \tilde{\sigma}_i)$. In particular, we define the sets $\tilde{w}_i$, $\tilde{r}_i$ and $\tilde{\sigma}_i$ so as to satisfy $w'_i = w_i + \tilde{w}_i$, $r'_i = r_i + \tilde{r}_i$ and
$\sigma'_i = \sigma_i + \tilde{\sigma}_i$ (addition is performed element-wise over $\mathbb{F}_{q'}$ on corresponding sub-symbols in each
set). For Bob to decode correctly, the property that $y_i$ fails Bob's authentication test if $\tilde{w}_i \ne 0$ needs to be
satisfied with high probability. More formally, noting that $r_i$ is not known to Calvin and is thus independent
of $\tilde{w}_i$, we need, for all $i$ and all $e_i$ such that $\tilde{w}_i \ne 0$, that $\Pr_{r_i}[H_{r'_i}(w'_i) = \sigma'_i \mid H_{r_i}(w_i) = \sigma_i]$ is sufficiently
small. Or equivalently, that

$$\Pr_{r_i}\big[H_{r_i + \tilde{r}_i}(w_i + \tilde{w}_i) = \sigma_i + \tilde{\sigma}_i \mid H_{r_i}(w_i) = \sigma_i\big] = \Pr_{r_i}\big[H_{r_i + \tilde{r}_i}(w_i + \tilde{w}_i) - H_{r_i}(w_i) = \tilde{\sigma}_i\big]$$

is sufficiently small.

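To complete our proof we present our hash family $\mathcal{H}$. Recall that $w_i$ consists of $n^2$ sub-symbols in
$\mathbb{F}_{q'}$. Let $W_i$ represent $w_i$ when arranged as an $n \times n$ matrix. Let $\mathbf{r}_i$ be a column vector of $n$ symbols
corresponding to $r_i$. We define the value of the hash $H_{r_i}(w_i)$ as the length-$n$ column vector
$W_i \mathbf{r}_i$. Thus for the corresponding errors $\tilde{w}_i \ne 0$, $\tilde{r}_i$, $\tilde{\sigma}_i$ defined above, $H_{r_i + \tilde{r}_i}(w_i + \tilde{w}_i) - H_{r_i}(w_i) = \tilde{\sigma}_i$ iff
$(W_i + \tilde{W}_i)(\mathbf{r}_i + \tilde{\mathbf{r}}_i) - W_i \mathbf{r}_i = \tilde{\boldsymbol{\sigma}}_i$. Here $\tilde{W}_i$ is the matrix representation of $\tilde{w}_i$, and $\tilde{\mathbf{r}}_i$, $\tilde{\boldsymbol{\sigma}}_i$ correspond to $\tilde{r}_i$, $\tilde{\sigma}_i$.
Namely, the corrupted symbol received by Bob is authenticated only if $\tilde{W}_i \mathbf{r}_i = \tilde{\boldsymbol{\sigma}}_i - (W_i + \tilde{W}_i)\tilde{\mathbf{r}}_i$.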

For Calvin to corrupt Alice's transmission, we assume that $w'_i \ne w_i$, or equivalently $\tilde{W}_i \ne 0$; therefore
the rank of $\tilde{W}_i$ is at least 1. Now, in $\tilde{W}_i \mathbf{r}_i = \tilde{\boldsymbol{\sigma}}_i - (W_i + \tilde{W}_i)\tilde{\mathbf{r}}_i$, the left hand side depends on $\mathbf{r}_i$ while the
right hand side does not. Hence the equation is satisfied by at most $(q')^{n-1}$ values for the vector $\mathbf{r}_i$. Since
$\mathbf{r}_i$ is uniformly distributed over $(\mathbb{F}_{q'})^n$ and unknown to Calvin, the probability that this corrupted symbol
passes Bob's authentication test is at most $1/q' = o(n^{-1})$ if $q'$ is chosen to be $n \cdot \omega(1)$.
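
The counting argument can be verified by brute force on a tiny instance. The sketch below is illustrative only: we assume $q'$ is prime so that integers modulo $q'$ form the field, represent $W_i$ as a tuple of rows, and enumerate every key $\mathbf{r}_i$ to count how many let a fixed additive error pass the test $H_{r'_i}(w'_i) = \sigma'_i$. All function names are ours.

```python
from itertools import product


def mat_vec(W, r, q):
    """Hash H_r(w) = W r over the integers modulo q."""
    return tuple(sum(a * b for a, b in zip(row, r)) % q for row in W)


def make_packet(W, r, q):
    """Alice's packet (w_i, r_i, H_{r_i}(w_i))."""
    return (W, tuple(r), mat_vec(W, r, q))


def add_error(packet, error, q):
    """Apply an additive error (W~, r~, sigma~) component-wise."""
    (W, r, s), (We, re, se) = packet, error
    W2 = tuple(tuple((a + b) % q for a, b in zip(ra, rb))
               for ra, rb in zip(W, We))
    r2 = tuple((a + b) % q for a, b in zip(r, re))
    s2 = tuple((a + b) % q for a, b in zip(s, se))
    return (W2, r2, s2)


def authenticate(packet, q):
    """Bob's test: does the received signature match the received (w', r')?"""
    W, r, s = packet
    return mat_vec(W, r, q) == tuple(s)


def count_passing_keys(W, error, q):
    """Number of keys r in (F_q)^n for which the corrupted packet passes."""
    n = len(W)
    return sum(authenticate(add_error(make_packet(W, r, q), error, q), q)
               for r in product(range(q), repeat=n))
```

With $q' = 5$, $n = 2$, `W = ((1, 2), (3, 4))` and the error `(((1, 0), (0, 0)), (2, 3), (4, 3))`, exactly $5 = (q')^{n-1}$ of the $25$ keys pass, so a uniformly random key exposes the corruption with probability $1 - 1/q'$. Changing the error's signature part to `(4, 1)` leaves no passing key at all.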

All in all, our communication scheme succeeds if each corrupted symbol with $\tilde{w}_i \ne 0$ fails the
authentication test. By a union bound over the at most $n$ corrupted symbols, this happens with probability
at least $1 - n/q' = 1 - o(1)$, as desired. Taking $m = n^2 + 2n$,
the rate of the code is $(1 - o(1))(1 - p)$ and the field size needed is $(q')^m = \exp(\mathrm{poly}(n))$. ■

4 Proof of Theorem 3

Proof of Upper bound: We start by addressing the three cases in the upper bound on the capacity
$C^{\mathrm{ow}}_d(p)$. First, if $p < d$, Calvin corrupts the first $pn$ symbols uniformly at random as in the proof of
Theorem 2 to attain an upper bound of $1 - p$ on the achievable rate. Second, if $p \ge 1/2$ and the rate $R > 0$
is positive, Calvin picks a codeword $\mathbf{x}'$ uniformly at random from Alice's codebook. With probability at
least $1 - q^{-nR}$, Alice's true codeword $\mathbf{x}$ is distinct from the codeword $\mathbf{x}'$. Calvin then flips an unbiased coin,
and depending on the outcome he corrupts either the first half or the second half of $\mathbf{x}$. This corruption
is done by replacing the symbols of $\mathbf{x}$ by the corresponding symbols of $\mathbf{x}'$. If indeed $\mathbf{x} \ne \mathbf{x}'$, Bob has no
way of determining whether Alice transmitted $\mathbf{x}$ or $\mathbf{x}'$. Thus, Bob's probability of decoding incorrectly is
at least $\frac{1}{2}(1 - q^{-nR}) > \frac{1}{4}$ for large enough $q$ and/or $n$.

Finally, if $d < p < 1/2$, we present a "wait-and-attack" strategy for Calvin to prove that $1 - 2p + d$
is an upper bound on $C_d^{\mathrm{ow}}(p)$. Suppose not, and that rate $R = 1 - 2p + d + \epsilon$ is achievable for some
$\epsilon > 0$. Then there are $q^{Rn}$ possible messages in Alice's codebook. Calvin starts by eavesdropping on, but
not corrupting, the first $(R - \epsilon)n$ symbols Alice transmits. He then overwrites the next $dn$ symbols with
symbols chosen uniformly at random from $\mathbb{F}_q$. These $dn$ locations convey no information to Bob. At this
point (after Alice transmits $(R + d - \epsilon)n$ symbols), the $d$-delay Calvin only knows the value of the first
$(R - \epsilon)n$ symbols of $x$. It can be verified that with probability at least $1 - q^{-\epsilon n/2}$ over Alice's codebook, after
Alice's first $(R + d - \epsilon)n$ transmitted symbols, the set $S$ of codewords consistent with what Bob and Calvin
have observed thus far is of size at least $q^{\epsilon n/2}$. Calvin then picks a random $x'$ from $S$. With probability
at least $1 - q^{-\epsilon n/2}$, $x'$ is distinct from Alice's $x$. Calvin then flips an unbiased coin, and depending on the
outcome he corrupts either the first half or the second half of the remaining $(1 - (R + d - \epsilon))n = 2(p - d)n$
symbols of $x$. This corruption is done by replacing the symbols of $x$ by the corresponding symbols of $x'$. If
indeed $x \neq x'$, Bob has no way of determining whether Alice transmitted $x$ or $x'$. Thus Bob's probability
(over the message set and over the choice of Calvin) of decoding incorrectly is at least $\frac{1}{2}(1 - q^{-\epsilon n/2})^2 > \frac{1}{4}$.
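
The bookkeeping behind the three cases and the "wait-and-attack" budget can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the function names are hypothetical, the inputs are assumed to be `Fraction` values, and treating the boundary cases $p = d$ and $p = 1/2$ as shown is an assumption of this example.

```python
from fractions import Fraction


def capacity_overwrite(p, d):
    """Piecewise rate bound from the three cases of the upper-bound proof.

    Assumption of this sketch: p = 1/2 is treated as zero rate (the coin-flip
    attack needs only n/2 corruptions), and p = d falls in the 1 - p branch
    (both formulas agree there).
    """
    p, d = Fraction(p), Fraction(d)
    if p >= Fraction(1, 2):
        return Fraction(0)
    if p <= d:
        return 1 - p
    return 1 - 2 * p + d


def wait_and_attack_schedule(p, d, eps):
    """Fractions of the block length spent in each phase when d < p < 1/2
    and the assumed rate is R = 1 - 2p + d + eps."""
    p, d, eps = Fraction(p), Fraction(d), Fraction(eps)
    rate = 1 - 2 * p + d + eps
    listen = rate - eps                      # eavesdrop only
    random_overwrite = d                     # overwritten with uniform symbols
    remaining = 1 - (listen + random_overwrite)
    swap = remaining / 2                     # one half replaced by symbols of x'
    return {
        "listen": listen,
        "random_overwrite": random_overwrite,
        "remaining": remaining,
        "swap": swap,
        "total_corrupted": random_overwrite + swap,
    }
```

With these definitions the total corrupted fraction is $d + (p - d) = p$, so the attack stays within Calvin's budget.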

Proof of Lower bound: We now prove that the rate $C_d^{\mathrm{ow}}(p)$ specified in Theorem 3 is indeed achievable
with a computationally tractable code. The scheme we present covers all positive rates in the rate-region
specified in Theorem 3, i.e., whenever $p < 1/2$. In particular the rate $R$ of our codes equals $1 - p$ if
$d > p$, and equals $1 - 2p + d$ if $d < p$. Our scheme roughly follows the ideas that appear in the scheme
of Section 3. Namely, Alice's encoding scheme comprises an erasure code along with a hash function
used for authentication. However, in general, an overwrite adversary is more powerful than an additive
adversary, because it can be directly shown that an overwriting adversary can substitute any symbol $x_i$ by
a new symbol $y_i$ that can pass the authentication scheme used by Bob in Section 3. We thus propose a
more elaborate authentication scheme in which each symbol $x_i$ contains information that allows for pairwise
authentication with every other symbol $x_j$.

Using notation similar to that of Section 3, let $u$ be the message Alice would like to transmit to Bob,
and $v = v_1, \ldots, v_n$ be the encoding of $u$ via an efficiently encodable and decodable erasure code (here we
use Reed-Solomon codes). Let $q$ be sufficiently large (to be specified explicitly later in the proof). Let
$m = n^4 + 2n^3$ (note that this is significantly larger than in Theorem 2). As mentioned in Section 2, Alice
treats each symbol of a codeword $x = x_1, \ldots, x_n$ as a packet, by breaking each $x_i$ into $m$ sub-symbols $x_{i,1}$
through $x_{i,m}$ from $\mathbb{F}_{q'}$. She partitions $x_{i,1}$ through $x_{i,m}$ into three consecutive sequences of sub-symbols
of sizes $n^4$, $n^3$ and $n^3$ respectively. The sub-symbols $x_{i,1}$ through $x_{i,n^4}$ are denoted by the set $w_i$, and
correspond to the sub-symbols of $v_i$, the $i$th symbol of the erasure-codeword $v$ generated by Alice. The
next $n^3$ sub-symbols are arranged into $n$ sets of $n^2$ sub-symbols each, denoted by the sets $r_{ij}$ for each
$j \in [n]$, and constitute Alice's secret for packet $i$. That is, each $r_{ij}$ consists of $n^2$ sub-symbols chosen
independently and uniformly at random from $\mathbb{F}_{q'}$. For each $i$ and $j$, $r_{ij}$ is chosen independently. The final
$n^3$ sub-symbols are arranged into $n$ sets of $n^2$ sub-symbols each, denoted by the sets $\sigma_{ij}$ for each $j \in [n]$, and
consist of the pairwise hashes of the symbols $x_i$ and $x_j$. We define $\sigma_{ij}$ to be $H_{r_{ij}}(w_j)$, where $H_{r_{ij}}$ is taken
from (a slight variation of) a pairwise independent family $\mathcal{H}$ (known in advance to all parties). Namely,
$\sigma_{ij}$ is the hash of the information $w_j$ from $x_j$ using a key $r_{ij}$ from the transmitted symbol $x_i$. All in all, each
transmitted symbol $x_i$ of Alice consists of the tuple $(w_i, \{r_{ij}\}_j, \{H_{r_{ij}}(w_j)\}_j)$. Here $j = 1, \ldots, n$.

We now explicitly demonstrate the construction of each $w_i$ from Alice's message $u$. Alice chooses
$R = (1 - (2n^3)/m)C$, where $C$ is an abbreviation of the capacity $C_d^{\mathrm{ow}}(p)$ specified in Theorem 3. Note
that $R$ equals $C$ asymptotically in $n$ and $m$. Thus the message $u$ she wishes to transmit to Bob has
$mRn = (m - 2n^3)Cn = Cn^5$ sub-symbols over $\mathbb{F}_{q'}$. Alice uses an erasure code (resilient to $(1 - C)n^5$
erasures) to transform these sub-symbols of $u$ into the vector $v$ comprising $n^5$ sub-symbols over $\mathbb{F}_{q'}$.
She then denotes consecutive blocks of $n^4$ sub-symbols of $v$ by corresponding $w_i$'s. More specifically, $w_i$
consists of the sub-symbols in $v$ in locations $n^4(i - 1) + 1$ through $n^4 i$. Here $i = 1, \ldots, n$.
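
The packet-size arithmetic can be sanity-checked numerically. The sketch below assumes the layout in which each packet carries an $n^2 \times n^2$ data matrix, $n$ keys of length $n^2$, and $n$ hashes of length $n^2$ (so $m = n^4 + 2n^3$); the function name and the returned field names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def packet_layout(n, capacity):
    """Sub-symbol counts for one packet and for the whole message.

    Assumed layout (an assumption of this sketch): data block of n^4 sub-symbols,
    n keys of n^2 sub-symbols each, and n hashes of n^2 sub-symbols each.
    """
    data = n ** 4
    keys = n * n ** 2
    hashes = n * n ** 2
    m = data + keys + hashes
    # Rate after paying for keys and hashes: R = (1 - 2 n^3 / m) * C.
    rate = (1 - Fraction(keys + hashes, m)) * Fraction(capacity)
    return {
        "m": m,
        "rate": rate,
        "message_subsymbols": m * rate * n,      # m R n
        "codeword_subsymbols": n * data,         # length of v
        "overhead": Fraction(keys + hashes, m),  # vanishes as n grows
    }
```

For example, with `n = 4` and `capacity = Fraction(1, 2)` the message has $C n^5 = 512$ sub-symbols and the erasure codeword $v$ has $n^5 = 1024$.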

The remainder of the proof is as follows. We first discuss the property of the family $\mathcal{H}$ of hash functions
in use, needed for our analysis. We then describe and analyze Bob's decoding algorithm.

As mentioned above we use a (variation of a) pairwise independent hash family $\mathcal{H} = \{H_r\}$ with the
property that for all $w'_j \neq w_j$, the probability over $r_{ij}$ that $H_{r_{ij}}(w'_j)$ equals $H_{r_{ij}}(w_j)$ is sufficiently small.
Such functions are common in the literature (e.g., see [HI [7]). In fact, we use essentially the same hashes 
as in Theorem 2 except with different inputs and dimension. Namely, let $W_j$ and $W'_j$ represent $w_j$ and $w'_j$
respectively arranged as $n^2 \times n^2$ matrices. Let $\mathbf{r}_{ij}$ be a length-$n^2$ column vector of symbols corresponding to
$r_{ij}$. We define the hash $H_{r_{ij}}(w_j)$ as the column vector $\boldsymbol{\sigma}_{ij} = W_j \mathbf{r}_{ij}$. Note that $H_{r_{ij}}(w'_j) = H_{r_{ij}}(w_j)$ means
that $W'_j \mathbf{r}_{ij} = W_j \mathbf{r}_{ij}$, which implies that $(W'_j - W_j)\mathbf{r}_{ij} = 0$. But by assumption $w'_j \neq w_j$, so $W'_j \neq W_j$, and
so $W'_j - W_j$ is of rank at least 1. Thus a random $\mathbf{r}_{ij}$ satisfies $(W'_j - W_j)\mathbf{r}_{ij} = 0$ with probability at most $1/q'$. ■

We now define Bob's decoder. Let $x_i$, $x_j$ be two symbols transmitted by Alice, and $y_i$, $y_j$ be the
corresponding symbols received by Bob. Consider the information $w_i$, the secret $r_{ij}$ and the hash value $\sigma_{ij}$
in $x_i$, and let $w'_i$, $r'_{ij}$ and $\sigma'_{ij}$ be the corresponding (potentially corrupted) values in $y_i$. Similarly consider
the components of $x_j$ and $y_j$. Bob checks for mutual consistency between $y_i$ and $y_j$. Namely, the pair $y_i$
and $y_j$ are said to be mutually consistent if both $\sigma'_{ij} = H_{r'_{ij}}(w'_j)$ and $\sigma'_{ji} = H_{r'_{ji}}(w'_i)$. Clearly, if both $y_i$ and
$y_j$ are uncorrupted versions of $x_i$ and $x_j$ respectively, they are mutually consistent. By the analysis above
of $H_{r_{ij}}$, if Calvin does not know the value of $r_{ij}$, does not corrupt $x_i$ but corrupts $w_j$, then the probability
over $r_{ij}$ that $y_i$ and $y_j$ are consistent is at most $1/q'$. This is because $\sigma'_{ij} = \sigma_{ij} = H_{r_{ij}}(w_j)$, $r'_{ij} = r_{ij}$, and
w.h.p. $H_{r_{ij}}(w'_j) \neq H_{r_{ij}}(w_j)$. We conclude:

Lemma 4.1 With probability at least $1 - 1/q'$, the following $y_i$ and $y_j$ are mutually inconsistent. (i)
Causality: If $i > j$, $x_i = y_i$ and $w'_j \neq w_j$. (ii) d-delay: If $|i - j| < dn$, and Calvin corrupts exactly one of
the symbols $x_i$ and $x_j$ so that either $w'_i \neq w_i$ or $w'_j \neq w_j$.
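
The packet structure and the mutual-consistency test can be sketched concretely as follows. This is a toy model under stated assumptions: arithmetic is modulo a prime `q` standing in for $\mathbb{F}_{q'}$, packets are plain dictionaries, and the names `make_packets` and `mutually_consistent` are hypothetical.

```python
def matvec(W, r, q):
    """Multiply the matrix W (a list of rows) by the column vector r, modulo the prime q."""
    return [sum(a * b for a, b in zip(row, r)) % q for row in W]


def make_packets(ws, q, rng):
    """Build Alice's packets x_i = (w_i, {r_ij}_j, {sigma_ij}_j).

    ws[i] is the square matrix form W_i of the data block w_i.
    rng is any object with a randrange(q) method (e.g. random.Random()).
    sigma_ij = W_j r_ij is the hash of block j under a key stored in packet i.
    """
    n = len(ws)
    side = len(ws[0])
    packets = []
    for i in range(n):
        keys = [[rng.randrange(q) for _ in range(side)] for _ in range(n)]
        hashes = [matvec(ws[j], keys[j], q) for j in range(n)]
        packets.append({"w": ws[i], "r": keys, "sigma": hashes})
    return packets


def mutually_consistent(y_i, y_j, i, j, q):
    """Bob's pairwise test: each packet's stored hash of the other packet's
    data must match a recomputation with the stored key."""
    return (y_i["sigma"][j] == matvec(y_j["w"], y_i["r"][j], q)
            and y_j["sigma"][i] == matvec(y_i["w"], y_j["r"][i], q))
```

If Calvin changes the data block of one packet without knowing the key held in the other packet, the stored hash and the recomputed hash disagree unless the key happens to lie in the kernel of the difference matrix.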

Bob decodes via the d-Delay Online Overwriting Disruptive Adversary Decoding (d-DOODAD) Algorithm,
described in detail below. We first give a high-level overview of the three major steps of d-DOODAD.
Bob's first step is to test pairs of received symbols $(y_i, y_j)$ for mutual consistency. In particular he considers
only pairs of symbols separated by at most $dn$ locations; in this event Lemma 4.1(ii) implies that Bob
detects the corruption of exactly one of a pair of symbols with high probability.

Based on the $O(dn^2)$ tests in the first step, in the second step he enumerates subsets of the received
symbols $\{y_1, \ldots, y_n\}$ as "candidate subsets" for decoding via Alice's erasure code. In particular, each of
the candidate subsets satisfies the natural property that it contains at least $(1 - p)n$ mutually consistent
$y_i$'s. Naively, this enumeration seems computationally intractable since there may be exponentially many
such sets. However, there is also a more intricate combinatorial property (Step 2(c) in the d-DOODAD
algorithm below) that candidate subsets must satisfy; we discuss this property after presenting the details
of the algorithm. The effect of Step 2 below is to drastically curtail the number of candidate subsets that
Bob needs to consider, to at most $n^{p/d}$, hence ensuring that this step is still computationally tractable.

In the third step, for each of the candidate subsets generated in the previous step, Bob uses the decoder 
for Alice's erasure code to generate a set of linear equations that the sub-symbols of her message u must 
satisfy. Then we claim that any candidate subset that has even one corrupted symbol must generate a 
set of inconsistent linear equations. Hence Bob decodes by using the decoder for Alice's erasure code on 
the unique candidate subset that generates a consistent set of linear equations. As we will see, the error
probability of our scheme will be $n^2/q'$, which is $o(1)$ if we set $q = \exp(\mathrm{poly}(n))$.
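
The third step (decode from part of a candidate subset, then check the result against the whole subset) can be illustrated with a toy polynomial-evaluation code. The sketch below is an assumption-laden stand-in for Alice's erasure code: it works modulo a small prime `q`, uses Lagrange interpolation, and the helper names are hypothetical. It relies on `pow(x, -1, q)` for modular inverses, which requires Python 3.8 or later.

```python
def encode(u, points, q):
    """Evaluate the polynomial with coefficient list u at each point, modulo the prime q."""
    return [sum(c * pow(x, e, q) for e, c in enumerate(u)) % q for x in points]


def interpolate(xs, ys, q):
    """Coefficients of the unique polynomial of degree < len(xs) through (xs, ys), mod q."""
    k = len(xs)
    coeffs = [0] * k
    for i in range(k):
        basis = [1]          # numerator of the i-th Lagrange basis polynomial
        denom = 1
        for j in range(k):
            if j == i:
                continue
            # Multiply the running basis polynomial by (X - xs[j]).
            new = [0] * (len(basis) + 1)
            for e, c in enumerate(basis):
                new[e] = (new[e] - c * xs[j]) % q
                new[e + 1] = (new[e + 1] + c) % q
            basis = new
            denom = denom * (xs[i] - xs[j]) % q
        scale = ys[i] * pow(denom, -1, q) % q
        for e, c in enumerate(basis):
            coeffs[e] = (coeffs[e] + scale * c) % q
    return coeffs


def decode_and_check(received, indices, k, points, q):
    """Decode from the first k indices of the candidate set, then verify that the
    re-encoded codeword agrees with the received values on every index of the set.
    Returns the decoded message, or None if the candidate set is inconsistent."""
    chosen = indices[:k]
    u_hat = interpolate([points[i] for i in chosen], [received[i] for i in chosen], q)
    reencoded = encode(u_hat, points, q)
    if all(reencoded[i] == received[i] for i in indices):
        return u_hat
    return None
```

A candidate set containing a corrupted position yields an inconsistent system and is rejected, whereas a clean candidate set returns the original message.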

The details of d-DOODAD now follow. We define a connected component $\mathcal{G}_i$ of an undirected graph
$\mathcal{G}$ as a connected subgraph of $\mathcal{G}$ such that there is no edge in $\mathcal{G}$ between any vertex in $\mathcal{G}_i$ and any vertex
outside it. Also, let $\mathcal{L}$ be the linear transform of the Reed-Solomon code that takes the length-$Cn^5$ column
vector of Alice's message $u$ to the length-$n^5$ column vector of the erasure codeword $v$. Hence $\mathcal{L}u = v$.
Let the column vector of sub-symbols corresponding to $v$ in the transmission Bob receives be denoted $w'$.
For any subset $\mathcal{I} \subseteq [n^5]$ of size $Cn^5$, let $\mathcal{L}_{\mathcal{I}}$, $v_{\mathcal{I}}$ and $w'_{\mathcal{I}}$ be respectively defined as the restriction of
$\mathcal{L}$, $v$ and $w'$ to the $i$th rows/indices, for all $i \in \mathcal{I}$.

d-Delay Online Overwriting Disruptive Adversary Decoding (d-DOODAD) Algorithm:

1. Bob constructs a d-distance mutual consistency graph $\mathcal{G}$ with vertex set $\{y_1, \ldots, y_n\}$ and edge-set
comprising all mutually consistent pairs $(y_i, y_j)$ such that $|i - j| < nd$ (but no other edges). Thus
$\mathcal{G}$ comprises $\ell \le n$ connected components $\{\mathcal{G}_1, \ldots, \mathcal{G}_\ell\}$.

2. Let $\mathcal{K}$ be a subset of $[\ell]$. We define the candidate subset $\mathcal{C}(\mathcal{K})$ of $\mathcal{G}$ as the set $\{\mathcal{G}_k \mid k \in \mathcal{K}\}$ of connected
components in $\mathcal{G}$. If the size of $\mathcal{K}$ is $j$, we say $\mathcal{C}(\mathcal{K})$ has size $j$. Bob enumerates all possible candidate
subsets $\mathcal{C}(\mathcal{K})$ of $\mathcal{G}$ such that (a) The candidate subset $\mathcal{C}(\mathcal{K})$ has size at most $c = p/d$. (b) The number
of vertices in the subgraphs in $\mathcal{C}(\mathcal{K})$ is at least $(1 - p)n$. (c) Each pair of vertices $y_i$ and $y_j$ in the
union of the subgraphs in $\mathcal{C}(\mathcal{K})$ are mutually consistent.

3. Let $K \subseteq [n^5]$ be the set comprising the indices in $w'$ corresponding to all symbols $y_i$ in the components
of $\mathcal{C}(\mathcal{K})$. Bob picks an arbitrary subset $\mathcal{I} \subseteq K$ of size $Cn^5$. If $\mathcal{L}_K(\mathcal{L}_{\mathcal{I}}^{-1} w'_{\mathcal{I}}) = w'_K$, he decodes $u$
as the sub-symbols in the vector $\mathcal{L}_{\mathcal{I}}^{-1} w'_{\mathcal{I}}$. Otherwise he discards $\mathcal{K}$ and returns to the beginning of
Step 3.
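
Steps 1 and 2 of the algorithm can be sketched as follows. The sketch treats mutual consistency as an abstract predicate `consistent(i, j)` supplied by the caller (an assumption; in the scheme it would be Bob's pairwise hash test), takes `p` and `d` as `Fraction` values so that $p/d$ and $(1-p)n$ are computed exactly, and uses hypothetical function names. It follows the size bound $c = p/d$ exactly as stated in Step 2(a).

```python
from fractions import Fraction
from itertools import combinations


def connected_components(n, edges):
    """Connected components of an undirected graph on vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


def candidate_subsets(n, p, d, consistent):
    """Enumerate unions of at most floor(p/d) components that have at least
    (1 - p) n vertices and are pairwise mutually consistent."""
    p, d = Fraction(p), Fraction(d)
    # Step 1: edges only between consistent pairs fewer than d*n positions apart.
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if j - i < d * n and consistent(i, j)]
    components = connected_components(n, edges)
    max_size = p // d                        # Step 2(a), c = p/d rounded down
    found = []
    for size in range(1, max_size + 1):
        for chosen in combinations(components, size):
            vertices = sorted(v for comp in chosen for v in comp)
            if len(vertices) < (1 - p) * n:                              # Step 2(b)
                continue
            if all(consistent(a, b) for a, b in combinations(vertices, 2)):  # Step 2(c)
                found.append(vertices)
    return found
```

As a usage example, with `n = 10`, `p = Fraction(2, 5)`, `d = Fraction(1, 5)` and an idealized predicate under which a pair is consistent exactly when both symbols are clean or both are corrupted, corrupting positions 4 and 5 leaves a single candidate: the eight uncorrupted positions.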

Claim 4.1 The d-DOODAD algorithm decodes Alice's message correctly with probability at least $1 - n^2/q'$.
Proof: Throughout we assume that Lemma 4.1 holds for all corresponding $y_i$ and $y_j$ (by the union bound
this happens with probability at least $1 - n^2/q'$). Thus corrupted $y_i$ and uncorrupted $y_j$ are non-adjacent
in $\mathcal{G}$. We first prove that at least one $\mathcal{C}(\mathcal{K})$ with only uncorrupted symbols satisfies Steps 2 and 3.
We examine the three conditions of Step 2. By the definition of mutual consistency any set with only
uncorrupted symbols satisfies Step 2(c). Since Calvin can corrupt at most $pn$ symbols, there must be some
$\mathcal{C}(\mathcal{K})$ satisfying Step 2(b). To prove that $\mathcal{C}(\mathcal{K})$ also satisfies Step 2(a), we observe the following. If Calvin
does not corrupt at least $dn$ consecutive symbols between two uncorrupted symbols $y_i$ and $y_j$ (say $i < j$), there
must be a sequence of at most $j - i + 1$ uncorrupted symbols with indices $i = k_0 < k_1 < k_2 < \cdots < k_t = j$ (with $t \le j - i$)
such that any two consecutive symbols in the sequence have indices that differ by less than $dn$. Then by
the definition of $\mathcal{G}$, both $y_i$ and $y_j$ must be in the same connected component of $\mathcal{G}$. But there are at most
$pn$ corrupted symbols, hence there are at most $c = p/d$ disjoint sequences of $nd$ consecutive corrupted
symbols (and thus at most $c$ components in $\mathcal{C}(\mathcal{K})$).

Lastly, we show that any $\mathcal{C}(\mathcal{K})$ with only uncorrupted symbols and satisfying Step 2 must also satisfy
Step 3. To see this, note that any such $\mathcal{C}(\mathcal{K})$ has at least $(1 - p)n$ symbols from $\mathbb{F}_q$. Thus, by the definitions
of $m$ and $C$ for Theorem 3, $\mathcal{C}(\mathcal{K})$ has at least $(1 - p)n^5 \ge Cn^5$ uncorrupted sub-symbols over $\mathbb{F}_{q'}$. Also,
since $\mathcal{C}(\mathcal{K})$ comprises solely uncorrupted symbols, $w'_K = v_K$, hence for any $\mathcal{I} \subseteq K$, $w'_{\mathcal{I}} = v_{\mathcal{I}}$. But by the
properties of erasure codes, $\mathcal{L}_{\mathcal{I}}^{-1} v_{\mathcal{I}} = u$, Alice's message vector. Thus $\mathcal{L}_K(\mathcal{L}_{\mathcal{I}}^{-1} w'_{\mathcal{I}}) = \mathcal{L}_K u = v_K = w'_K$.
We now show that there does not exist any $\mathcal{C}(\mathcal{K}')$ such that the corresponding output of the d-DOODAD
algorithm $u(\mathcal{C}(\mathcal{K}'))$ differs from Alice's real message $u$. We prove this by contradiction. Suppose a $\mathcal{C}(\mathcal{K}')$
passes all the decoding steps of the d-DOODAD algorithm and results in a $u(\mathcal{C}(\mathcal{K}'))$ distinct from Alice's
message $u$. We now make a series of observations that successively refine the structure of such a $\mathcal{C}(\mathcal{K}')$,
resulting in the conclusion that, w.h.p., $\mathcal{C}(\mathcal{K}')$ contains no corrupted symbols, and therefore $u(\mathcal{C}(\mathcal{K}')) = u$.

First, note that $\mathcal{C}(\mathcal{K}')$ must contain uncorrupted symbols to pass Step 2(b), since $p < 1/2$. In addition,
to pass Step 2(c), by Lemma 4.1(i), all the uncorrupted symbols of $\mathcal{C}(\mathcal{K}')$ must come before all the symbols
corrupted by Calvin. Now notice that the uncorrupted and the corrupted symbols in $\mathcal{C}(\mathcal{K}')$ must be
separated by a separating set $\mathcal{R}$ of at least $nd$ consecutive symbols not in $\mathcal{C}(\mathcal{K}')$. If not, Lemma 4.1(ii)
would imply that w.h.p. $\mathcal{C}(\mathcal{K}')$ does not satisfy Step 2(c) of d-DOODAD. Now note that the separating set
$\mathcal{R}$ must contain at least $dn$ consecutive symbols corrupted by Calvin. This follows from the fact that $\mathcal{C}(\mathcal{K}')$
consists of connected components. Namely, if $\mathcal{R}$ contains fewer than $dn$ corrupted symbols, there must exist
an uncorrupted symbol $y_i$ and a corrupted symbol $y_j$, both in $\mathcal{C}(\mathcal{K}')$, satisfying $|j - i| < dn$. But this, by
Lemma 4.1(ii), would contradict Step 2(c). Notice that if $d > p$ we may conclude our proof at this point.

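We now observe that there are at most $(p - d)n$ corrupted symbols in $\mathcal{C}(\mathcal{K}')$. This follows from the fact
that $\mathcal{R}$ contains $dn$ consecutive symbols corrupted by Calvin (not in $\mathcal{C}(\mathcal{K}')$), and the fact that Calvin can
corrupt at most $pn$ symbols. This, together with Step 2(b) of d-DOODAD, implies that the component
set $\mathcal{C}(\mathcal{K}')$ contains a proper subset $\mathcal{C}(\mathcal{K}'')$ with at least $Cn$ uncorrupted symbols. Finally, let $\mathcal{I}$ be any
subset of $Cn^5$ uncorrupted sub-symbols in $\mathcal{C}(\mathcal{K}'')$. Let $\mathcal{I}'$ be any other subset of $Cn^5$ sub-symbols in $\mathcal{C}(\mathcal{K}'')$.
Consider the corresponding message vectors $u = \mathcal{L}_{\mathcal{I}}^{-1} w'_{\mathcal{I}}$ and $u' = \mathcal{L}_{\mathcal{I}'}^{-1} w'_{\mathcal{I}'}$ that Step 3 of d-DOODAD
may decode to. Since $K'$ is of size at least $(1 - p)n^5$, by the property of erasure codes [6], if $u' \neq u$, then
$\mathcal{L}_{K'} u' \neq \mathcal{L}_{K'} u$. Thus $\mathcal{L}_{K'}(\mathcal{L}_{\mathcal{I}'}^{-1} w'_{\mathcal{I}'}) \neq \mathcal{L}_{K'}(\mathcal{L}_{\mathcal{I}}^{-1} w'_{\mathcal{I}}) = \mathcal{L}_{K'} u$. In particular the left-hand side
differs from $w'$ on the uncorrupted indices $\mathcal{I}$ (where $w'_{\mathcal{I}} = \mathcal{L}_{\mathcal{I}} u$), contradicting Step 3. ■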



5 Conclusion 

In this work we characterize the capacity of online adversarial channels and their variants under the
additive and overwrite error models. Our results are tight and our coding schemes are efficient. Throughout, we
assume that the communication is over an alphabet of size $q$, assumed to be large compared to the block-length
$n$. An intriguing problem left untouched in this work concerns communication in the online adversarial
setting over "small", e.g. binary, alphabets. The authentication schemes used extensively in this work
depend integrally on the alphabet size being large. They do not extend naively to the binary alphabet
case, where new techniques seem to be needed.
A List of parameters of our codes





                              Capacity      Minimum q    Complexity                    Probability of Error

Theorem 1                     1 - 2p        q > n        O(n^ log n log^ q)
Theorem 2                     1 - p                      O(n^ log n log^ q)
Theorem 3  d < p < 0.5        1 - 2p + d                 O(n^(p/d+2) log n log^ q)
           p < d, p < 0.5     1 - p                      O(n^ log n log^ q)





Table 1: Bounds on the capacity $C$, the alphabet size $q$ required to achieve capacity, the computational complexity,
and the probability of error of our main results. The bounds are in terms of the parameters $p$ (adversary's
power), $d$ (adversary's delay), $n$ (block-length), $q$ (field-size), and $\delta$ (difference between the capacity $C$ and the rate $R$).



Table 1 is obtained by careful analysis of the parameters of the algorithms corresponding to Theorems 1,
2 and 3. The corresponding values for the scenarios in Theorem 4 are omitted since they are element-wise
identical to those in the table. The values in Table 1 substitute the rate-overhead parameter $\delta$ for the
packet-size parameter $m$ used in the proofs of Theorems 2 and 3 since we feel this choice of variables is
more "natural" when examining the tradeoffs between code parameters. Also, the algorithms presented
in the proofs of Theorems 2 and 3 correspond to a particular setting of the $\delta$ parameter; we omitted this
degree of freedom in the presentation of the proofs, for ease of exposition. Lastly, no effort has been made
to optimize the tradeoffs between the parameters in Table 1; in fact, we have preliminary results on schemes
that improve on some of these parameters (work in progress).

B Proof of Theorem 1

As discussed in the Introduction, the lower bound of Theorem [1] follows from known constructions |10^ [T]. 
To complete the proof, then, all that is needed is a corresponding upper bound on the capacity. The
required upper bound is novel. However, it is a special case of the upper bound of Theorem 3, and follows
directly if the parameter $d$ in the corresponding proof is set to zero.

C Proof of Theorem 4

In the jam-or-listen online model, Calvin is assumed to be unaware of the value of the symbols $x_i$ that he
corrupts. Theorem 4 states that a jam-or-listen adversary is still as powerful as the previously described
online adversaries, and is actually a corollary of Theorems 1, 2 and 3. First of all, the code constructions
corresponding to the lower bounds are the same as in Theorems 1, 2 and 3. As for the upper bounds, it
is not hard to verify that the attacks for Calvin outlined in each of the settings addressed in the paper
correspond to a jam-or-listen adversary, and hence are valid attacks for this scenario as well.